ABSTRACT

We discovered a novel Betacoronavirus lineage A coronavirus, China Rattus coronavirus 
HKU24 (ChRCoV HKU24), from Norway rats in China. ChRCoV HKU24 occupied a deep 
branch at the root of members of Betacoronavirus 1, being distinct from murine coronavirus and 
HCoV HKU1. Its unique putative cleavage sites at nsp1/2 and S, and low sequence identities to 
other lineage A βCoVs in conserved replicase domains, support ChRCoV HKU24 as a separate
species. ChRCoV HKU24 possessed genome features that resemble both Betacoronavirus 1 and
murine coronavirus, being closer to Betacoronavirus 1 in most predicted proteins, but closer to
murine coronavirus by G+C content, a single NS4 and absent TRS for E. Its N-terminal domain 
(NTD) demonstrated higher sequence identity to BCoV than to MHV NTDs, with three of four 
critical sugar-binding residues in BCoV and two of 14 contact residues at the MHV
NTD/mCEACAM1a interface being conserved. Molecular clock analysis dated the tMRCA of
ChRCoV HKU24, Betacoronavirus 1 and RbCoV HKU14 to ~1400. Cross reactivities were
demonstrated between other lineage A and B βCoVs and ChRCoV HKU24 nucleocapsid but not
spike polypeptide. Using the spike polypeptide-based western blot, we showed that only Norway
rats and two Oriental house rats from Guangzhou were infected by ChRCoV HKU24. Other rats,
including Norway rats from Hong Kong, only possessed antibodies against N protein but not
spike, suggesting infection by βCoVs different from ChRCoV HKU24. ChRCoV HKU24 may
represent the murine origin of Betacoronavirus 1 and rodents are likely an important reservoir
for ancestors of lineage A βCoVs.

IMPORTANCE

While bats and birds are hosts for ancestors of most coronaviruses (CoVs), lineage A βCoVs
have never been found in these animals and the origin of Betacoronavirus lineage A remains
obscure. We discovered a novel lineage A βCoV, China Rattus coronavirus HKU24 (ChRCoV
HKU24), from Norway rats in China, with a high seroprevalence. The unique genome features
and phylogenetic analysis supported that ChRCoV HKU24 represents a novel CoV species,
occupying a deep branch at the root of members of Betacoronavirus 1 and distinct from murine
coronavirus. Nevertheless, ChRCoV HKU24 possessed genome characteristics that resemble 
both Betacoronavirus 1 and murine coronavirus. Our data suggest that ChRCoV HKU24 
represents the murine origin of Betacoronavirus 1, with interspecies transmission from rodents to 
other mammals having occurred centuries ago before the emergence of HCoV OC43 in the late
1800s. Rodents may be an important reservoir for ancestors of lineage A βCoVs.

INTRODUCTION

Coronaviruses (CoVs) infect a wide variety of animals including humans, causing respiratory, 
enteric, hepatic and neurological diseases of varying severity. Based on genotypic and 
serological characterization, CoVs were traditionally classified into three distinct groups (1, 2). 
Recently, the Coronavirus Study Group of the International Committee for Taxonomy of Viruses 
(ICTV) has revised the nomenclature and taxonomy to re-classify the three CoV groups into 
three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus (3). Novel CoVs, 
which represented a novel genus, Deltacoronavirus, have also been identified (4-6). As a result 
of the ability to use a variety of host receptors and evolve rapidly through mutation and 
recombination, CoVs are capable of adapting to new hosts and ecological niches, causing wide
spectra of diseases (2, 7-12). 

The severe acute respiratory syndrome (SARS) epidemic and the identification of SARS-CoV-like
viruses from palm civets and horseshoe bats in China have boosted interest in the
discovery of novel CoVs in both humans and animals (13-20). It is now known that CoVs from
all four genera can be found in mammals. Historically, alphacoronaviruses (αCoVs) and
betacoronaviruses (βCoVs) were found in mammals while gammacoronaviruses (γCoVs) were
found in birds. However, recent findings suggested the presence of γCoVs also in mammals (5,
21, 22). Although deltacoronaviruses (δCoVs) were also mainly found in birds, potential
mammalian δCoVs have been reported (4, 23). In particular, a δCoV closely related to sparrow
CoV HKU17, porcine CoV HKU15, has been identified in pigs, which suggested avian-to-mammalian
transmission (4). Based on current findings, a model for CoV evolution was
proposed, where bat CoVs are likely the gene source of Alphacoronavirus and Betacoronavirus,
and avian CoVs are the gene source of Gammacoronavirus and Deltacoronavirus (4). However,
one notable exception to this model is Betacoronavirus lineage A.

The genus Betacoronavirus consists of four lineages, A to D. While human coronavirus 
OC43 (HCoV OC43) and human coronavirus HKU1 (HCoV HKU1) belong to Betacoronavirus 
lineage A (20, 24-27), SARS coronavirus (SARS-CoV) belongs to Betacoronavirus lineage B 
and the recently emerged Middle East respiratory syndrome coronavirus (MERS-CoV) belongs
to Betacoronavirus lineage C. No human CoV has yet been identified from Betacoronavirus
lineage D. On the other hand, besides Alphacoronavirus, diverse bat CoVs have been found in
Betacoronavirus lineage B (e.g. SARS-related Rhinolophus bat CoVs), lineage C (e.g.
Tylonycteris bat CoV HKU4 and Pipistrellus bat CoV HKU5) and lineage D (e.g. Rousettus bat
CoV HKU9) (8, 14, 15, 28-37), supporting that bat CoVs are likely the ancestral origin of other
mammalian CoVs in these lineages. However, no bat CoVs belonging to Betacoronavirus 
lineage A have yet been identified, despite the numerous surveillance studies on bat CoVs 
conducted in various countries over the years (38). Therefore, the ancestral origin of the 
mammalian lineage A βCoVs, such as HCoV OC43 and HCoV HKU1, remains obscure.

While HCoV OC43 is likely to have originated from zoonotic transmission, sharing a
common ancestor with bovine coronavirus (BCoV) dating back to 1890 (27, 30, 39), closely
related CoVs belonging to the same species, Betacoronavirus 1, have also been found in various
mammals including pigs, horses, dogs, waterbucks, sable antelope, deer, giraffes, alpacas and
dromedary camels, suggesting a common ancestor in mammals with subsequent frequent
interspecies transmission (40-47). Although no zoonotic origin of HCoV HKU1 has been
identified, the virus is most closely related to mouse hepatitis virus (MHV) and rat coronavirus
(RCoV) which, together, are now classified as murine coronavirus (3, 20, 42). We therefore
hypothesize that rodent CoVs are the ancestral origin of Betacoronavirus lineage A. In this study,
we tested samples from various rodent species in Hong Kong and southern China for the
presence of lineage A βCoVs. A novel CoV, China Rattus coronavirus HKU24 (ChRCoV
HKU24), was discovered from Norway rats in Guangzhou. Complete genome analysis showed
that ChRCoV HKU24 represents a novel species within Betacoronavirus lineage A, but
possessed features that resemble both Betacoronavirus 1 and murine coronavirus. High
seroprevalence was also demonstrated among Norway rats from Guangzhou using western blot
analysis against ChRCoV HKU24 recombinant N protein and spike polypeptide. The present
results suggest that ChRCoV HKU24 likely represents the murine origin of Betacoronavirus 1
and provide insights into the ancestor of Betacoronavirus lineage A.

MATERIALS AND METHODS
Sample collection. All rodent samples were collected from January 2010 to August 2012 using 
procedures described previously (5, 14). Samples from southern China were collected from 
animal markets or restaurants. Samples from Hong Kong were collected from wild and street 
rodents by the Agriculture, Fisheries and Conservation Department, and Food and 
Environmental Hygiene Department of the Hong Kong Special Administrative Region (HKSAR) 
respectively. Alimentary samples were placed in viral transport medium containing Earle's 
balanced salt solution (Invitrogen, New York, United States), 20% glucose, 4.4% NaHCO3, 5%
bovine albumin, 50000 μg/ml vancomycin, 50000 μg/ml amikacin and 10000 units/ml nystatin,
before transportation to the laboratory for RNA extraction. The study was approved by the 
Committee on the Use of Live Animals for Teaching and Research, The University of Hong 
Kong. 

RNA extraction. Viral RNA was extracted from the samples using QIAamp Viral RNA 
Mini Kit (Qiagen, Hilden, Germany). The RNA was eluted in 60 μl of Buffer AVE and was used
as the template for RT-PCR. 

RT-PCR of RdRp gene of CoVs using conserved primers and DNA sequencing. 
Initial CoV screening was performed by amplifying a 440-bp fragment of the RNA-dependent 
RNA polymerase (RdRp) gene of CoVs using conserved primers
(5’-GGTTGGGACTATCCTAAGTGTGA-3’ and 5’-CCATCATCAGATAGAATCATCATA-3’)
designed by multiple alignments of the nucleotide (nt) sequences of available RdRp genes of
known CoVs (14, 20). Reverse transcription was performed using the SuperScript III kit (Invitrogen,
San Diego, CA, USA). The PCR mixture (25 μl) contained cDNA, PCR buffer (10 mM Tris-HCl
pH 8.3, 50 mM KCl, 2 mM MgCl2 and 0.01% gelatin), 200 μM of each dNTP and 1.0 U Taq
polymerase (Applied Biosystems, Foster City, CA, USA). The mixtures were amplified in 60
cycles of 94°C for 1 min, 50°C for 1 min and 72°C for 1 min and a final extension at 72°C for 10
min in an automated thermal cycler (Applied Biosystems, Foster City, CA, USA). Standard
precautions were taken to avoid PCR contamination and no false positives were observed in
negative controls.

PCR products were gel-purified using the QIAquick gel extraction kit (Qiagen, Hilden, 
Germany). Both strands of the PCR products were sequenced twice with an ABI Prism 3700 
DNA Analyzer (Applied Biosystems, Foster City, CA, USA), using the two PCR primers. The 
sequences of the PCR products were compared with known sequences of the RdRp genes of 
CoVs in the GenBank database. 

Viral culture. The three rodent samples positive for ChRCoV HKU24 by RT-PCR were 
subjected to virus isolation in Huh-7.5 (human hepatoma), Vero E6 (African green monkey
kidney), HRT-18G (human rectum epithelial), BSC-1 (African green monkey renal epithelial),
RK13 (rabbit kidney), MDBK (bovine kidney), NIH/3T3 (mouse embryonic fibroblast), J774
(mouse macrophage), BHK-21 (baby hamster kidney), RK3E (rat kidney), RMC (rat kidney
mesangial), RAW264.7 (mouse macrophage) and primary SD rat lung cells as described
previously (48, 49). 

Real-time RT-PCR quantitation. Real-time RT-PCR was performed on rodent samples 
positive for ChRCoV HKU24 by RT-PCR using previously described procedures (14). Reverse 
transcription was performed using the SuperScript III kit with random primers (Invitrogen, San 
Diego, CA, USA). cDNA was amplified in a LightCycler instrument with a FastStart DNA Master
SYBR Green I Mix reagent kit (Roche Diagnostics GmbH, Mannheim, Germany) using specific
primers (5’-ACAGGTTCTCCCTTTATAGATGAT-3’ and 5’-TCTCCTGTATAGTAGCAGAAGCAT-3’)
targeting the RdRp gene of ChRCoV HKU24 using
procedures described previously (14, 50). For quantitation, a reference standard was prepared
using pCRII-TOPO vector (Invitrogen, San Diego, CA, USA) containing the target sequence.
Tenfold dilutions equivalent to 3.77 to 3.77x10° copies per reaction were prepared to generate 
concomitant calibration curves. At the end of the assay, PCR products (133-bp fragment of 
RdRp) were subjected to melting curve analysis (65 to 95°C, 0.1°C/s) to confirm the specificity of
the assay. The detection limit of this assay was 3.77 copies per reaction. 
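
Quantitation of this kind rests on a standard curve relating the threshold cycle (Ct) to the log10 copy number of the plasmid dilutions. The following is a minimal sketch of that calculation, not the authors' analysis code: the function names, the six-point dilution range and the Ct values are all illustrative assumptions.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical tenfold dilution series starting at 3.77 copies per reaction.
standard_copies = [3.77 * 10 ** k for k in range(6)]
# Invented Ct values lying on an ideal line (3.32 cycles per tenfold step).
standard_ct = [36.0 - 3.32 * k for k in range(6)]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
```

An unknown sample's Ct would then be passed to `copies_from_ct` and scaled by the input mass to express the result as copies per gram.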

Complete genome sequencing. Three complete genomes of ChRCoV HKU24 were 
amplified and sequenced using the RNA extracted from the original alimentary samples as 
templates. The RNA was converted to cDNA by a combined random-priming and oligo(dT) 
priming strategy. The cDNA was amplified by degenerate primers designed by multiple 
alignments of the genomes of other CoVs with complete genomes available, using strategies 
described in our previous publications (14, 20, 35, 49) and the CoV database, CoVDB (51), for 
sequence retrieval. Additional primers were designed from the results of the first and subsequent 
rounds of sequencing. These primer sequences are available on request. The 5’ ends of the viral 
genomes were confirmed by rapid amplification of cDNA ends using the 5'/3' RACE kit (Roche 
Diagnostics GmbH, Mannheim, Germany). Sequences were assembled and manually edited to 
produce final sequences of the viral genomes. 

Genome analysis. The nt sequences of the genomes and the deduced amino acid (aa) 
sequences of the open reading frames (ORFs) were compared to those of other CoVs with 
available complete genomes using the CoVDB (51). Phylogenetic tree construction was 
performed using the maximum likelihood method in PhyML, with bootstrap values calculated
from 100 trees. Protein family analysis was performed using PFAM and InterProScan (52, 53).
Prediction of transmembrane domains was performed using TMHMM (54). The structure of
ChRCoV HKU24 N-terminal domain (NTD) was predicted using a web-based homology- 
modelling server, SWISS-MODEL. BLASTp search was performed against Protein Data Bank 
(PDB) with the default parameters to find suitable templates for homology modelling. Based on 
the higher sequence identity, QMEAN Z-score, coverage and lower e-value, the crystal structure of
the BCoV NTD (PDB code: 4h14) was selected as the template. The predicted structure was
visualized using Jmol. 
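
Pairwise identities such as those reported for the predicted proteins are computed from aligned sequences. The sketch below shows one common definition (identical residues divided by the aligned columns in which neither sequence has a gap). The text does not state which definition or tool settings were used, so this choice is an assumption, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values when gaps are frequent.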

Estimation of divergence dates. Divergence time was calculated based on complete 
RdRp and HE gene sequence data using a Bayesian Markov Chain Monte Carlo (MCMC) 
approach as implemented in BEAST (version 1.8.0) as described previously (49, 55, 56). One 
parametric model (Constant Size) and one nonparametric model (Bayesian Skyline) tree priors 
were used for inference. Analyses were performed under the SRD06 model, and using both a strict
and a relaxed molecular clock. The MCMC run was 2 × 10^8 steps long with sampling every 1,000
steps. Convergence was assessed on the basis of effective sampling size after a 10% burn-in 
using Tracer software, version 1.5 (55). The mean time of the most recent common ancestor 
(tMRCA) and the highest posterior density regions at 95% (HPDs) were calculated, and the best- 
fitting models were selected by a Bayes factor using marginal likelihoods implemented in Tracer 
(56). Bayesian skyline under a relaxed-clock model with an uncorrelated exponential distribution 
was adopted for making inferences, as Bayes factor analysis for the RdRp and HE genes 
indicated that this model fitted the data better than other models tested. The tree was summarized 
in a target tree by the Tree Annotator program included in the BEAST package by choosing the 
tree with the maximum sum of posterior probabilities (maximum clade credibility) after a 10%
burn-in.
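
The summary statistics described above (a mean tMRCA and a 95% HPD after discarding burn-in) can be illustrated on a plain list of sampled dates. This is a generic sketch of a sample-based HPD, assuming a unimodal posterior. It is not the Tracer implementation, and the function names are hypothetical.

```python
import math


def discard_burn_in(chain, burn_in_percent=10):
    """Drop the first burn_in_percent of the sampled states."""
    n_discard = len(chain) * burn_in_percent // 100
    return chain[n_discard:]


def posterior_mean(samples):
    """Arithmetic mean of the retained samples."""
    return sum(samples) / len(samples)


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    n = len(ordered)
    window = min(n, max(1, math.ceil(mass * n)))
    best = (ordered[0], ordered[window - 1])
    # Slide a fixed-size window over the sorted samples and keep the narrowest.
    for i in range(1, n - window + 1):
        candidate = (ordered[i], ordered[i + window - 1])
        if candidate[1] - candidate[0] < best[1] - best[0]:
            best = candidate
    return best
```

Because the HPD is the narrowest such interval rather than a pair of fixed percentiles, it can be asymmetric around the mean, as the reported intervals are.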
Cloning and purification of (His)6-tagged recombinant ChRCoV HKU24
nucleocapsid protein and spike polypeptide. To produce fusion plasmids for protein
purification, primers 5’-CTAGCTAGCATGTCTCATACGCCA-3’ and
5’-CTAGCTAGCTTATATTTCTGAGCTTCCC-3’, and
5’-CTAGCTAGCCAACCAATAGCAGATGTGTA-3’ and
5’-CTAGCTAGCTTATCTCTTGGCTCGCCATGT-3’, were used to amplify the nucleocapsid
gene and a partial S1 fragment encoding amino acid residues 317 to 763 of the spike protein of
ChRCoV HKU24 respectively as described previously (31, 49, 57, 58). The sequences, coding
for a total of 443 aa and 447 aa residues respectively, were amplified and cloned into the NheI
site of expression vector pET-28b(+) (Merck KGaA, Darmstadt, Germany) in frame and
downstream of the series of six histidine residues. The (His)6-tagged recombinant nucleocapsid
protein and spike polypeptide were expressed and purified using Ni-NTA affinity
chromatography (Qiagen, Hilden, Germany) according to the manufacturer’s instructions.
Western blot analysis. To detect the presence of antibodies against ChRCoV HKU24 N 
protein and spike polypeptide in rodent sera and to test for possible cross antigenicity between 
ChRCoV HKU24 and other βCoVs, 600 ng of purified (His)6-tagged recombinant N protein or
spike polypeptide of ChRCoV HKU24 was loaded into the well of a sodium dodecyl sulfate
(SDS)-10% polyacrylamide gel and subsequently electroblotted onto a nitrocellulose membrane
(Bio-Rad, Hercules, CA, USA). The blot was cut into strips and the strips were incubated
separately with 1:2000, 1:4000 or 1:8000 dilutions of sera collected from rodents with serum
samples available, human sera from two patients with HCoV OC43 infection, sera from two
rabbits with RbCoV HKU14 infection and human sera from two patients with SARS-CoV infection
respectively. Antigen-antibody interaction was detected with 1:4000 horseradish peroxidase-conjugated
anti-rat IgG, anti-human IgG or anti-rabbit IgG (Zymed) and the ECL fluorescence
system (GE Healthcare Life Sciences, Little Chalfont, UK) as described previously (14, 58).

Nucleotide sequence accession numbers. The nt sequences of the three genomes of
ChRCoV HKU24 have been lodged within the GenBank sequence database under accession no.
KM349742 to KM349744.

RESULTS

Identification of a novel CoV from Norway rats in China. Of 91 alimentary samples from 
rodents in China, RT-PCR for a 440-bp fragment in the RdRp gene of CoVs was positive for a 
potentially novel CoV in three samples from Norway rats (Rattus norvegicus) from a restaurant 
in Guangzhou (Table 1). None of the 573 alimentary samples from rodents in Hong Kong, 
including those from Norway rats, was positive for CoVs. Sequencing results suggested that the 
potentially novel virus was most closely related to MHV with <85% nt identities, and members 
of the species Betacoronavirus 1 including HCoV OC43, BCoV, equine coronavirus (ECoV) and
porcine hemagglutinating encephalomyelitis virus with <84% nt identities. Quantitative RT-PCR 
showed that the viral load in the positive samples ranged from 1.2x10° to 1.3x10° copies/g. 
Attempts to stably passage ChRCoV HKU24 in cell cultures were unsuccessful, with no 
cytopathic effect or viral replication being detected. 

Genome organization and coding potential of ChRCoV HKU24. Complete genome 
sequence data of three strains of ChRCoV HKU24 were obtained by assembly of the sequences 
of RT-PCR products from the RNA directly extracted from the corresponding individual 
specimens. The three genomes shared >99% nt sequence similarity. Their genome size was 
31234 bases, with the G + C content (40%) closer to that of murine coronavirus than to that of 
Betacoronavirus 1 (Table 2). The genome organization is similar to that of other lineage A
βCoVs, with the characteristic gene order 5’-replicase ORF1ab, haemagglutinin-esterase (HE),
spike (S), envelope (E), membrane (M), nucleocapsid (N)-3’ (Table 2 and Fig. 1). Moreover,
additional ORFs coding for non-structural proteins, NS2a, NS4, NS5 and N2, are found. A
putative transcription regulatory sequence (TRS) motif, 5’-CUAAAC-3’, similar to that of
αCoVs and the motif, 5’-UCUAAAC-3’, in other lineage A βCoVs, was identified at the 3’ end
of the leader sequence and precedes each ORF except the NS4, E and N2 genes (Table 3) (26, 49,
59-61). However, there were base mismatches for HE and NS5, with alternative TRS motifs,
5’-CUGAAC-3’ and 5’-GUAAAC-3’ respectively.
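
Locating candidate TRS motifs upstream of each ORF amounts to scanning the genome for the core hexamer while tolerating a limited number of mismatches. The sketch below is one simple way to do this and is not the method used in the study. The function name and the one-mismatch tolerance are assumptions, the latter chosen because the HE and NS5 variants each differ from 5’-CUAAAC-3’ at a single position.

```python
def find_motif(sequence, motif="CUAAAC", max_mismatches=1):
    """Return (start, matched_text, mismatches) for each near-match of motif.

    Positions are 0-based offsets into `sequence`, which is expected to be
    an RNA string using the letters A, C, G and U.
    """
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# The two alternative motifs reported for HE and NS5 each count as
# single-mismatch hits against the core motif.
he_hits = find_motif("CUGAAC")
ns5_hits = find_motif("GUAAAC")
```

On a real genome a hexamer with one mismatch allowed will match many times by chance, so hits would still need to be filtered by their position relative to each start codon.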

The coding potential and characteristics of putative non-structural proteins (nsps) of 
ORF1 of ChRCoV HKU24 are shown in Tables 3 and 4. The ORF1 polyprotein possessed
68.6-75.0% aa identities to the polyproteins of other lineage A βCoVs. It possessed a unique
putative cleavage site, G/L, between nsp1 and nsp2, in contrast to G/V found in other lineage A
βCoVs except HCoV HKU1 with G/I (Table 4 and Fig. 1). Other predicted cleavage sites were
mostly conserved between ChRCoV HKU24 and other lineage A βCoVs. However, the lengths
of nsp1, nsp2, nsp3, nsp13, nsp15 and nsp16 in ChRCoV HKU24 differed from those of
corresponding nsps in members of Betacoronavirus 1 and murine coronavirus, as a result of 
deletions or insertions. 

All lineage A βCoVs, except HCoV HKU1, possess an NS2a gene between ORF1ab and HE.
Unlike RbCoV HKU14 with the NS2a broken into several small ORFs (49), ChRCoV HKU24 is
predicted to possess a single NS2a protein as in other lineage A βCoVs. This NS2a protein
displayed 43.7-62.0% aa identities to those of Betacoronavirus 1 and 45.7-47.3% aa identities to
those of murine coronavirus. Although the BCoV-specific NS2 protein has been shown to be
non-essential for in vitro viral replication (62), cyclic phosphodiesterase domains have been
predicted in the NS2 proteins of some CoVs and toroviruses, and a possible role in viral
pathogenicity has been suggested in MHV (63, 64). In contrast to MHV and RCoV, such a domain
was not found in ChRCoV HKU24.

Similar to other CoV S proteins, the S of ChRCoV HKU24 is predicted to be a type I
membrane glycoprotein, with most of the protein (residues 16-1302) exposed on the outside of
the virus and with a transmembrane domain (residues 1303-1325) at the C terminus (Fig. 2).
Two heptad repeats (HR), important for membrane fusion and viral entry, were located at 
residues 1045-1079 (HR1) and 1253-1285 (HR2). The S protein of ChRCoV HKU24 possessed
66.7-69.6% aa identities to those of members of Betacoronavirus 1 and 62.4-64.3% identities to
those of members of murine coronavirus. The aa sequence identity between the ChRCoV
HKU24 NTD and BCoV and MHV NTDs was 61 and 56%, respectively. BCoV and HCoV
OC43 utilize N-acetyl-9-O-acetylneuraminic acid as receptor for initiation of infection (65, 66). In
contrast, MHV utilizes carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1)
as receptor and its receptor-binding domain does not bind sugars (10, 67, 68). Recent structural
studies showed that, among the four critical sugar-binding residues in BCoV, a Glu→Gly
substitution was found in one residue in MHV, which may explain the reduction in sugar-binding
affinity. In ChRCoV HKU24, a Glu→Ser substitution is found at this position (Fig. 2).
Comparison of the aa sequences between the S proteins of ChRCoV HKU24 and MHV showed
that ChRCoV HKU24 possessed many aa substitutions in the region corresponding to the MHV
NTD (Fig. 2). In particular, 12 of the 14 important contact residues at the MHV
NTD/mCEACAM1a interface were not conserved between ChRCoV HKU24 and MHV. Similar
to the MHV and BCoV NTDs, the ChRCoV HKU24 NTD is also predicted to contain a core
structure with a β-sandwich fold like that of human galectins (galactose-binding lectins) using homology
modelling (10). Modelling showed that the β-sandwich core structure of ChRCoV HKU24
consists of one six-stranded β-sheet and one seven-stranded β-sheet that are stacked together
through hydrophobic interactions (Fig. 2). In addition, the S of ChRCoV HKU24 possessed a
unique predicted cleavage site, RAKR, among lineage A βCoVs.
Other predicted domains in HE, S, NS4, NS5, E, M and N proteins of ChRCoV HKU24
are summarized in Table 3 and Fig. 1. The NS4 of ChRCoV HKU24 shared 37-42% aa identity
to the NS4 proteins of murine coronavirus. In most members of Betacoronavirus 1, the NS4 is
split into smaller proteins. The NS5 of ChRCoV HKU24 is homologous to the NS5/NS5a of
members of Betacoronavirus 1 with 47.7% to 51.4% aa identities, but to the NS5 of MHV with
only 39.5% aa identity. Interestingly, NS5 is not found in the genome of RCoV. The absence of a
preceding TRS upstream of the E of ChRCoV HKU24 suggests that the translation of this E
protein may be cap-independent, via an internal ribosomal entry site (IRES), as demonstrated in
MHV (69). Similarly, the E of RCoV and HCoV HKU1 was also not preceded by TRS. This is
in contrast to members of Betacoronavirus 1 which possess a preceding TRS upstream of their E
proteins (49, 61). Downstream of the N gene, the 3’-untranslated region contains a predicted bulged
stem-loop structure of 69 nt (nt position 30944-31012) that is conserved in βCoVs (70).
Overlapping with the bulged stem-loop structure by 5 nt, a conserved pseudoknot structure (nt
position 31008-31059) that is important for CoV replication is found. Since non-structural
proteins in CoVs may possess unique function for replication and virulence (71, 72), further
studies are warranted to understand the potential function of the nsps and NS proteins in
ChRCoV HKU24.

Phylogenetic analyses. Phylogenetic trees constructed using the aa sequences of RdRp, 
S and N proteins of ChRCoV HKU24 and other CoVs are shown in Fig. 3, and the 
corresponding pairwise aa identities are shown in Table 2. For all three genes, the three ChRCoV
HKU24 strains formed a distinct cluster among lineage A βCoVs, occupying a deep branch at the
root of and being most closely related to members of the species Betacoronavirus 1. Comparison
of the aa sequences of the seven conserved replicase domains, ADRP, nsp5 (3CLpro), nsp12
(RdRp), nsp13 (Hel), nsp14 (ExoN), nsp15 (NendoU) and nsp16 (O-MT), for CoV species
demarcation (3) showed that ChRCoV HKU24 possessed 69.5-81.7%, 82.2-86.8%, 88.1-92.6%,
88.9-94.8%, 80.2-88.7%, 70.1-79.5% and 83.8-89.7% aa identities to other lineage A βCoVs
respectively (Table 5). Based on the present results, we propose a novel species, ChRCoV 
HKU24, to describe this virus under Betacoronavirus lineage A and distinguish it from RCoV. 
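
The species demarcation argument can be checked directly from the identity ranges quoted above. The short sketch below counts the conserved replicase domains in which the highest identity to any other lineage A virus falls below 90%. The 90% cut-off is taken from the Discussion, while the dictionary layout and function name are illustrative assumptions.

```python
# (lowest, highest) percent aa identity to other lineage A viruses,
# as quoted in the text for each conserved replicase domain.
DOMAIN_IDENTITY_RANGES = {
    "ADRP": (69.5, 81.7),
    "nsp5 (3CLpro)": (82.2, 86.8),
    "nsp12 (RdRp)": (88.1, 92.6),
    "nsp13 (Hel)": (88.9, 94.8),
    "nsp14 (ExoN)": (80.2, 88.7),
    "nsp15 (NendoU)": (70.1, 79.5),
    "nsp16 (O-MT)": (83.8, 89.7),
}


def domains_below_threshold(ranges, threshold=90.0):
    """Domains whose highest pairwise identity is still below the threshold."""
    return [name for name, (_, highest) in ranges.items() if highest < threshold]


below = domains_below_threshold(DOMAIN_IDENTITY_RANGES)
```

With these values `below` contains five of the seven domains. The two exceptions are nsp12 (RdRp) and nsp13 (Hel), whose upper bounds exceed 90%.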

HE proteins are glycoproteins that mediate reversible attachment to O-acetylated sialic 
acids by acting as both lectins and receptor-destroying enzymes which aid viral detachment from 
sugars on infected cells (68, 73). Related HEs have been found in influenza C viruses, 
toroviruses and lineage A βCoVs, but not other CoVs. It has been suggested that HEs of lineage
A βCoVs have arisen from an influenza C-like HE fusion protein, likely as a result of relatively
recent lateral gene transfer events (73). Phylogenetic analysis of the HE proteins of lineage A
βCoVs, toroviruses and influenza C viruses showed that they fell into three separate clusters (Fig.
3). The HE of ChRCoV HKU24 also forms a deep branch at the root of members of the species
Betacoronavirus 1 except ECoV and is distinct from members of murine coronavirus. Previous
studies have demonstrated heterogeneity of gene expression of HE proteins among different
MHV strains (74). Since the HE of ChRCoV HKU24 is not preceded by a perfectly matched
TRS, further studies are required to determine whether it is expressed and functional.

Estimation of divergence dates. Using the uncorrelated relaxed clock model on 
complete RdRp gene sequences, the date of tMRCA of ChRCoV HKU24, members of 
Betacoronavirus 1 and RbCoV HKU14 was estimated to be 1402 (HPDs, 918.05 to 1749.91)
(Fig. 4). The date of divergence between HCoV OC43 and BCoV was estimated to be 1897
(HPDs, 1826.15 to 1950.05), consistent with results from previous molecular clock studies (27).
Using the uncorrelated relaxed clock model on complete HE gene sequences, the date of tMRCA
of ChRCoV HKU24, members of Betacoronavirus 1 and RbCoV HKU14 was estimated to be
1337 (HPDs, 724.59 to 1776.78) (Fig. 4). The date of divergence between HCoV OC43 and 
BCoV was estimated to be 1871 (HPDs, 1764.55 to 1944.37). The estimated mean substitution 
rates of the RdRp and HE data set were 1.877x10“ and 4.016x10~ substitution per site per year 
respectively, which are comparable to previous estimates in other lineage A βCoVs (26, 27, 39).

Serological studies. Western blot analysis using recombinant ChRCoV HKU24 N 
protein was performed using sera from 144 rodents with serum samples available, human sera 
from two patients with HCoV OC43 infection, sera from two rabbits with RbCoV HKU14 and 
human sera from two patients with SARS-CoV infection. Among tested sera from 74 Norway 
rats from Guangzhou with serum samples available, 60 (81.1%) were positive for antibody 
against recombinant ChRCoV HKU24 N protein with prominent immunoreactive bands of about 
50 kDa (Table 1 and Fig. 5). These 60 positive samples include three serum samples collected
from the three Norway rats positive for ChRCoV HKU24 in their alimentary samples. In 
addition, 15 (48.4%) of 31 Norway rats from Hong Kong were also positive for antibody against 
recombinant ChRCoV HKU24 N protein, although the virus was not detected in alimentary 
samples from these rats. Moreover, seven (77.8%) of nine oriental house rats but only four
(13.3%) of 30 black rats were positive for antibody against recombinant ChRCoV HKU24 N
protein. Possible cross antigenicity between ChRCoV HKU24 and other βCoVs, including
lineage A and B βCoVs, was found. Human sera from two patients with HCoV OC43 infection,
sera from two rabbits with RbCoV HKU14 infection and human sera from two patients with
SARS-CoV infection were also positive for antibody against recombinant ChRCoV HKU24 N
protein by western blot assay (Fig. 5).
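
The seroprevalence figures for the N protein assay follow from simple proportions of positive to tested animals. As a small worked check, the sketch below recomputes them from the counts given in the text. The dictionary keys and function name are illustrative only.

```python
# (positive, tested) counts for anti-N antibody, taken from the text.
SERUM_RESULTS = {
    "Norway rats, Guangzhou": (60, 74),
    "Norway rats, Hong Kong": (15, 31),
    "Oriental house rats": (7, 9),
    "Black rats": (4, 30),
}


def seroprevalence(positive, tested):
    """Percentage of tested animals that were antibody positive (1 decimal)."""
    return round(100.0 * positive / tested, 1)


percentages = {
    group: seroprevalence(positive, tested)
    for group, (positive, tested) in SERUM_RESULTS.items()
}
total_tested = sum(tested for _, tested in SERUM_RESULTS.values())
```

The four groups sum to the 144 rodents with serum samples available.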
Western blot analysis using recombinant ChRCoV HKU24 spike polypeptide was
performed to verify the specificity of antibodies against ChRCoV HKU24 N protein using
positive rodent sera and human sera from two patients with HCoV OC43 infection, sera from
two rabbits with RbCoV HKU14 and human sera from two patients with SARS-CoV infection.
Among sera from the 60 Norway rats with positive antibodies against ChRCoV HKU24 N
protein, 21 were positive for antibodies against ChRCoV HKU24 spike polypeptide with
prominent immunoreactive bands of about 50 kDa (Table 1 and Fig. 5). However, serum samples
from the three Norway rats positive for ChRCoV HKU24 in their alimentary samples were 
negative for anti-ChRCoV HKU24 spike polypeptide antibody. Of the seven oriental house rats 
with positive antibodies against ChRCoV HKU24 N protein, two were positive for antibodies 
against ChRCoV HKU24 spike polypeptide. However, serum samples from the four black rats
and 15 Norway rats from Hong Kong with positive antibodies against ChRCoV HKU24 N
protein were negative for antibodies against ChRCoV HKU24 spike polypeptide. In contrast to N
protein, no cross antigenicity was detected between ChRCoV HKU24 spike polypeptide and
positive sera against other βCoVs, including lineage A and B βCoVs. Human sera from two
patients with HCoV OC43 infection, sera from two rabbits with RbCoV HKU14 infection and
human sera from two patients with SARS-CoV infection were all negative for antibody against
recombinant ChRCoV HKU24 spike polypeptide by western blot assay (Fig. 5).

DISCUSSION

We discovered a novel lineage A βCoV, ChRCoV HKU24, from Norway rats in southern China.
Betacoronavirus lineage A comprises the traditional “group 2 CoVs” including members of
murine coronavirus and Betacoronavirus 1, HCoV HKU1 and RbCoV HKU14. ChRCoV
HKU24 possessed <90% aa identities to all other lineage A βCoVs in five of the seven conserved
replicase domains for CoV species demarcation by ICTV (3), supporting that ChRCoV HKU24
belongs to a separate species. The genome of ChRCoV HKU24 also possesses features distinct
from those of other lineage A βCoVs, including a unique putative nsp1/nsp2 cleavage site and a
unique putative cleavage site in S protein. Phylogenetically, its position at the root of
Betacoronavirus 1, being distinct from murine coronavirus and HCoV HKU1, suggested that
ChRCoV HKU24 may represent the murine ancestor for Betacoronavirus 1, after branching off
from the common ancestor of murine coronavirus and HCoV HKU1. Interestingly, the genome
of ChRCoV HKU24 possessed features that resemble both Betacoronavirus I] and murine 
coronavirus. It is more similar to Betacoronavirus I] than murine coronavirus by the higher 
sequence identities in most predicted proteins including NS2a, NSS and S. On the other hand, it 
is more similar to murine coronavirus than to Betacoronavirus I in terms of its G + C content, 
the presence of a single NS4 and absence of TRS upstream of E gene. Therefore, it is most likely 
that ChRCoV has evolved from the ancestor of murine coronavirus to infect other mammals, 
resulting in the generation of Betacoronavirus 1 with the acquisition of TRS for E gene. The 
tMRCA of ChRCoV HKU24, members of Betacoronavirus 1 and RoCoV HKU14 was estimated 
to be 1402 (HPDs, 918.05 to 1749.91) and 1337 (HPDs, 724.59 to 1776.78) using complete 


RdRp and HE gene analysis respectively, suggesting that interspecies transmission from rodents 
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to other mammals occurred at least several centuries ago before the emergence of HCoV OC43 
in humans at approximately1890s . 

Western blot assays based on recombinant ChRCoV HKU24 N protein and spike
polypeptide showed a high seroprevalence of ChRCoV HKU24 infection among Norway rats
from Guangzhou. We evaluated cross reactivities of both N protein and spike polypeptide assays
using sera from infections by other lineage A βCoVs, HCoV OC43 in humans and RbCoV
HKU14 in rabbits, as well as SARS-CoV, a lineage B βCoV. Cross-reacting antibodies against N
proteins were observed, which is in line with previous findings on cross-reactivity between N
proteins of different βCoVs (49, 57). In contrast, no cross reactivities were detected against spike
polypeptide, supporting the specificity of CoV spike polypeptide-based assays and their ability to
rectify cross reactivities (57, 58). Using the present assays, 60 of 74 Norway rats from
Guangzhou were positive for antibodies against ChRCoV HKU24 N protein, among which 21
were positive for antibodies against ChRCoV HKU24 spike polypeptide, supporting past infections by
ChRCoV HKU24 in these 21 rats. Interestingly, the three Norway rats positive for ChRCoV
HKU24 in their alimentary samples were positive for antibodies against ChRCoV HKU24 N
protein but negative for antibodies against ChRCoV HKU24 spike polypeptide. This is likely due
to a delay in mounting neutralizing antibodies against spike protein during acute infection in these
three rats, while antibodies against N protein may rise earlier as a result of the high abundance
and antigenicity of CoV N proteins, or may result from cross-reactions with other βCoVs. The
finding is also in keeping with previous findings on SARS-related Rhinolophus bat CoV, in which a
negative correlation was observed between viral load and neutralizing antibody (14). Besides
Norway rats, antibodies against ChRCoV HKU24 N protein and spike polypeptide were also
detected in two oriental house rats from Guangzhou, although antibodies against spike
polypeptide were relatively weak. This suggests possible cross-species infection by ChRCoV
HKU24 or cross reactivity from a very closely related lineage A βCoV. Four black rats and 15 Norway
rats in Hong Kong were also positive for antibodies against ChRCoV HKU24 N protein but not
spike polypeptide. This suggests possible past infection by other βCoV(s) with cross-reactivities
between their N proteins and that of ChRCoV HKU24. More studies on diverse rodent species
from China and other countries are required to determine the natural reservoir and host range of
ChRCoV HKU24 and other murine lineage A βCoVs.

The present results extend our knowledge on the evolutionary origin of CoVs. While
birds are important sources for γCoVs and δCoVs, bats host diverse αCoVs and βCoVs that may
be the ancestral origins of various mammalian CoVs including human CoVs. For human αCoVs,
both HCoV NL63 and HCoV 229E likely originated from bat CoVs. HCoV NL63 has
been shown to share common ancestry with αCoVs from the North American tricolored bat, with the
most recent common ancestor between these viruses occurring approximately 563 to 822 years
ago (75). Moreover, immortalized lung cell lines derived from this bat species allowed
replication of HCoV NL63, supporting potential zoonotic-reverse zoonotic transmission cycles
between bats and humans. HCoV 229E also shared a common ancestor with diverse αCoVs from
leaf-nosed bats in Ghana, with the most recent common ancestor dated to 1686-1800 (76).
However, no complete genomes are available for the putative bat ancestors of HCoV NL63 and
HCoV 229E. For human βCoVs, SARS-CoV and MERS-CoV are also known to share common
ancestors with bat CoVs. Soon after the SARS epidemic, horseshoe bats in China were found to
be the reservoir for SARS-CoV-like viruses, which were postulated to have jumped from bats to
civets and later humans (8, 14, 15). A recent study also reported the isolation of a SARS-like bat
CoV in Vero E6 cells, and the ability of this bat virus to use the angiotensin-converting enzyme 2
(ACE2) from humans, civets and Chinese horseshoe bats for cell entry (77). MERS-CoV belongs
to Betacoronavirus lineage C, which was only known to consist of two bat viruses, Tylonycteris
bat CoV HKU4 and Pipistrellus bat CoV HKU5, before the MERS epidemic (35-37). This has
led to the speculation that bats may be the zoonotic origin of MERS-CoV. However, recent
evidence supported dromedary camels as the immediate source of human MERS-CoV (78-80).
Nevertheless, a conspecific virus from a South African Neoromicia capensis bat has been found
to share 85% nt identity to the MERS-CoV genome, suggesting acquisition of MERS-CoV by
camels from bats in sub-Saharan Africa, from where camels on the Arabian Peninsula are
imported (81). In contrast, there has been no evidence for bats as the origin of human lineage A
βCoVs such as HCoV OC43 and HCoV HKU1. HCoV OC43, being closely related to BCoV, is
believed to have emerged relatively recently from bovine-to-human transmission around 1890
(27, 30, 39). Both viruses belong to the promiscuous CoV species, Betacoronavirus 1, which
consists of many closely related mammalian CoVs, implying a low threshold for
cross-mammalian species transmission and a complex evolutionary history among these viruses (40-47,
49). However, the ancestral origin of members of Betacoronavirus 1 remains elusive. As for
HCoV HKU1, no recent zoonotic ancestor has yet been identified, although the virus is most
closely related to members of murine coronavirus (20, 42). Although rodents constitute
approximately 40% of all mammalian species, murine coronavirus has been the only CoV
species known to exist in rodents. This is in contrast to the large diversity of CoVs found in bats,
which make up another 20% of all species of mammals (6, 33, 36). The present results suggest
that rodents may be an important reservoir for lineage A βCoVs and may harbor other ancestral
viruses of Betacoronavirus 1 and HCoV HKU1 (Fig. 6). Nevertheless, many mysteries remain
unresolved in the evolution of lineage A βCoVs, such as the origin of their HE proteins. For
example, both toroviruses and influenza C viruses, which also encode hemagglutinin-esterase
proteins, can be found in bovine and porcine samples.
Further studies are required to determine if the HE of potential rodent CoV ancestors of
Betacoronavirus lineage A may have been acquired from cattle or pigs.

The potential pathogenicity and tissue tropism of ChRCoV HKU24 remain to be
determined. While CoVs are associated with a wide spectrum of diseases in animals, some CoVs,
especially those from bats, were detected in apparently healthy individuals without obvious signs
of disease (8, 14, 15, 31, 33). The detection of ChRCoV HKU24 in the alimentary samples of
Norway rats suggested possible enteric tropism. However, the three positive rats did not show
obvious signs of disease. MHV, the prototype CoV most extensively studied before the SARS epidemic,
can cause a variety of neurological, hepatic, gastrointestinal and respiratory diseases in mice,
depending on the strain tropism and route of inoculation. The virus, originally isolated from a
mouse with spontaneous encephalomyelitis, causes disseminated encephalomyelitis with
extensive destruction of myelin and focal necrosis of the liver in experimentally infected mice
(82-84). Strain MHV-A59 is primarily hepatotropic, while strain MHV-JHM is neurotropic.
Enterotropic strains can spread quickly as a result of high levels of excretion in feces and cause
significant environmental contamination in animal houses. Respiratory-tropic or polytropic
strains, although uncommon, are the strains that commonly contaminate cell lines. As for RCoV,
it causes diseases primarily in the respiratory tract, with sialodacryoadenitis virus (SDAV) being
more associated with upper respiratory tract, salivary and lacrimal gland, and eye infections, and
strain RCoV-Parker causing pneumonia in experimentally infected rats (85, 86). Further
investigations are required to study the tissue tropism and pathogenicity of ChRCoV HKU24 in
Norway rats and other potential rodent reservoirs.

Elucidating the receptor of ChRCoV HKU24 will be important to understand the
mechanism of host adaptation and interspecies transmission from rodents to other mammals. The
higher sequence identity to Betacoronavirus 1 than to murine coronavirus in the S protein and
NTD of ChRCoV HKU24 is in line with other regions of the genome. Homology modelling
showed that the conformation of the sugar-binding loop in BCoV NTD is conserved in ChRCoV
HKU24 NTD. Moreover, three of the four critical sugar-binding residues in BCoV but only two
of the 14 contact residues at the MHV NTD/mCEACAM1a interface are conserved in ChRCoV
HKU24. While it remains to be ascertained whether ChRCoV HKU24 utilizes sugar or CEACAM1
as receptor, its predicted NTD appears to resemble that of BCoV more than that of MHV. Based
on the presence of a β-sandwich fold in the NTDs of MHV and BCoV, it has been proposed that
CoV NTDs may have originated from a host galectin with sugar-binding functions, but evolved
new structural features in MHV for binding to CEACAM1 (10, 87). If rodents are indeed the
host origin for Betacoronavirus lineage A including Betacoronavirus 1, it would be interesting to
study the sugar-binding activity of NTDs of different rodent βCoVs to understand their
evolutionary history. Although some lineage A βCoVs, such as Betacoronavirus 1 and MHV,
can replicate in cell lines such as BSC-1 and HRT-18 cells, attempts to isolate ChRCoV HKU24
from the three positive samples were unsuccessful. Future studies to isolate the virus from more
rodent samples will allow characterization of its receptor usage and pathogenicity.

LEGENDS TO FIGURES

FIG 1 Comparison of genome organizations of ChRCoV HKU24, MHV, HCoV OC43 and
HCoV HKU1. Papain-like proteases (PL1pro and PL2pro) are represented by orange boxes. The
residues at the cleavage site are indicated above or below the boundary of each nonstructural
protein. Unique cleavage site in ChRCoV HKU24 is in bold.

FIG 2 Predicted model of ChRCoV HKU24 spike protein and NTD using Swiss-Model tool. (A)
Predicted domain structure of ChRCoV HKU24 spike protein. NTD, N-terminal domain; RBD,
receptor-binding domain; HR, heptad-repeat; TM, transmembrane anchor. The signal peptide
corresponds to residues 1-15 and is cleaved during molecular maturation. (B) Sequence
alignment of ChRCoV HKU24 NTD with BCoV, HCoV-OC43 and MHV NTD, performed
using PROMALS3D. The three strains of ChRCoV HKU24 characterized in this study are
bolded. Beta strands are shown as yellow arrows, and the alpha helix is shown as a coiled ribbon.
Loop 10-11 is boxed. The 14 contact residues at the MHV NTD/mCEACAM1a interface are
highlighted in blue, the four BCoV critical sugar-binding residues are highlighted in brown, and
BCoV non-critical sugar-binding residues are highlighted in yellow. Location of residue
substitution that might decrease the sugar-binding affinity of BCoV NTD is marked by an inverted
triangle. Asterisks indicate positions that have fully conserved residues. Colons indicate
positions that have strongly conserved residues. Periods indicate positions that have weakly
conserved residues. (C) Predicted structure of the ChRCoV HKU24 NTD constructed through
homology modelling from BCoV NTD (4H14) and close-up of the pocket above the β-sandwich
core. The Global Model Quality Estimation score of 0.83 and QMEAN4 Z-score of -1.82
indicated reliable overall model quality.

FIG 3 Phylogenetic analyses of RdRp, S, N and HE proteins of ChRCoV HKU24. The trees
were constructed by the maximum likelihood method using the WAG+I+G substitution model, with
bootstrap values calculated from 100 trees. Bootstrap values below 70% are not shown. Nine
hundred and twenty-eight, 1358, 443 and 425 aa positions in RdRp, S, N and HE, respectively,
were included in the analyses. The scale bar represents 0.3 substitutions per site. The three
strains of ChRCoV HKU24 characterized in this study are bolded.

FIG 4 Estimation of tMRCA of ChRCoV HKU24 strains, BCoV/HCoV-OC43, and ChRCoV 
HKU24/members of Betacoronavirus 1/RbCoV HKU14 based on the complete RdRp and HE 
genes. The mean estimated dates (above the branch) and Bayesian posterior probabilities (below 
the branch) are labeled and are represented by gray squares. The taxa are labeled with their 
sampling dates. 

FIG 5 Western blot analysis for antibodies against purified (His)6-tagged recombinant ChRCoV
HKU24 N protein (~50 kDa) (A) and spike polypeptide (~50 kDa) (B) in rodent serum samples
and serum samples from other animals or humans infected by different βCoVs including HCoV
OC43 (Betacoronavirus lineage A), RbCoV HKU14 (Betacoronavirus lineage A) and
SARS-CoV (Betacoronavirus lineage B). Lanes: 1, negative control; 2, oriental house rat serum sample
negative for antibody against ChRCoV HKU24 N protein and spike polypeptide; 3, Norway rat
serum sample negative for antibody against ChRCoV HKU24 N protein and spike polypeptide; 4,
oriental house rat serum sample positive for antibody against ChRCoV HKU24 N protein and
spike polypeptide; 5, Norway rat serum sample positive for antibody against ChRCoV HKU24 N
protein and spike polypeptide; 6 and 7, serum samples from rabbits infected by RbCoV HKU14;
8 and 9, serum samples from patients with HCoV-OC43 infection; 10 and 11, serum samples
from patients with SARS-CoV infection; 12, positive control (anti-His antibody).


FIG 6 Evolution of CoVs from their ancestors in bat, bird and rodent hosts to virus species that
infect other animals. The dashed arrows indicate possible routes of transmission from bats or
birds to rodents before establishment of Betacoronavirus lineage A.

Table 1. Detection of ChRCoV HKU24 in rodents by RT-PCR and serological studies by Western blot analysis

Scientific name        Common name                 No. (%) positive for     No. (%) positive for     No. (%) positive for
                                                   ChRCoV HKU24 in          ChRCoV HKU24 antibody    ChRCoV HKU24 antibody
                                                   alimentary samples       by N-Western blot        by S1-Western blot
                                                   by RT-PCR                analysis                 analysis

Crocidura attenuata    Asian gray shrew            0/5 (0%)                 NA                       NA
Niviventer fulvescens  Chestnut white-bellied rat  0/97 (0%)                NA                       NA
Rattus andamanensis    Indochinese forest rat      0/170 (0%)               NA                       NA
Rattus norvegicus^a    Norway rat                  3/82 (3.6%)              60/74 (81.1%)            21/60 (35%)
Rattus norvegicus^b    Norway rat                  0/277 (0%)               15/31 (48.4%)            0/15 (0%)
Rattus rattus          Black rat                   0/24 (0%)                4/30 (13.3%)             0/4 (0%)
Rattus tanezumi        Oriental house rat          0/9 (0%)                 119 (17.8%)              2/7 (28.6%)

^a Norway rats from Guangzhou
^b Norway rats from Hong Kong
[Column "No. of rodents tested": only the values 308 and 44 are legible; their row assignment is not recoverable.]
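
The percentages in Table 1 follow directly from the reported counts. The short Python sketch below recomputes them from the positive/tested pairs given in the table and the Results text; the dictionary names and the helper functions are illustrative and not part of the original study, and rounding to one decimal place is an assumption about the reporting convention.

```python
# Counts (positive, tested) copied from Table 1 and the Results text.
# The S1 denominators are the animals already positive by N-Western blot.
N_WESTERN = {
    "Norway rat (Guangzhou)": (60, 74),
    "Norway rat (Hong Kong)": (15, 31),
    "Black rat": (4, 30),
}
S1_WESTERN = {
    "Norway rat (Guangzhou)": (21, 60),
    "Norway rat (Hong Kong)": (0, 15),
    "Black rat": (0, 4),
    "Oriental house rat": (2, 7),
}


def percent_positive(positive: int, tested: int) -> float:
    """Percentage of positive animals, rounded to one decimal place."""
    return round(100 * positive / tested, 1)


def summarize(counts: dict) -> dict:
    """Map each rodent group to its percentage of positive animals."""
    return {group: percent_positive(p, n) for group, (p, n) in counts.items()}


if __name__ == "__main__":
    for label, counts in (("N protein", N_WESTERN), ("Spike (S1)", S1_WESTERN)):
        for group, pct in summarize(counts).items():
            print(f"{label:12s} {group:24s} {pct:5.1f}%")
```

Keeping the counts as (positive, tested) pairs rather than as precomputed percentages makes it easy to spot a percentage that does not match its fraction.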


Table 2. Comparison of genomic features of ChRCoV HKU24 and other CoVs with complete
genome sequences available and aa identities between the predicted chymotrypsin-like protease
(3CLpro), RNA dependent RNA polymerase (RdRp), helicase (Hel), haemagglutinin-esterase
(HE), spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins of ChRCoV HKU24
and the corresponding proteins of other CoVs


Coronaviruses^a          Genome features    Pairwise amino acid identity (%) with ChRCoV HKU24-R05005I
                         Size     G+C
                         (bases)  content    3CLpro  RdRp   Hel    HE     S      E      M      N

Alphacoronavirus
TGEV                     28586    0.38       45.5    58.3   59.1   -      26.1   22.4   36.4   27.1
MCoV                     28894    0.38       46.5    39.3   57.2   -      25.8   24.7   32.0   27.6
CCoV                     29363    0.38       44.6    58.3   58.7   -      26.3   23.5   36.4   27.6
FIPV                     29355    0.38       45.2    58.4   58.7   -      25.6   22.4   33.8   26.1
PRCV                     27550    0.37       45.5    58.3   58.9   -      26.7   23.5   35.7   27.8
HCoV-229E                27317    0.38       44.4    56.3   57.7   -      26.9   26.5   32.5   26.9
HCoV-NL63                27553    0.34       42.8    56.5   57.6   -      25.8   31.0   32.8   25.4
PEDV                     28033    0.42       42.4    59.1   58.7   -      25.6   30.1   38.5   21.6
Rh-BatCoV HKU2           27165    0.39       43.6    57.6   55.8   -      24.6   30.6   35.9   26.4
Mi-BatCoV 1A             28326    0.38       42.4    58.0   58.4   -      25.1   31.3   32.9   26.9
Mi-BatCoV HKU8           28773    0.42       43.1    58.8   56.1   -      25.5   29.3   35.1   26.5
Sc-BatCoV 512            28203    0.40       41.1    58.3   58.1   -      25.2   26.8   38.0   24.9
Ro-BatCoV HKU10          28494    0.39       43.1    56.9   57.1   -      26.6   34.5   36.2   26.4
Hi-BatCoV HKU10          28492    0.38       43.1    56.7   57.0   -      25.8   34.5   35.7   25.6

Betacoronavirus lineage A
Betacoronavirus 1
HCoV-OC43                30738    0.37       85.8    91.8   93.5   70.1   67.1   78.6   88.7   74.0
BCoV                     31028    0.37       86.8    92.6   93.7   69.6   68.0   78.6   89.2   74.9
PHEV                     30480    0.37       86.5    92.0   93.7   68.9   67.0   77.4   89.2   74.0
ECoV                     30992    0.37       86.8    92.6   94.7   66.2   69.5   76.2   85.7   73.1
SACoV                    30995    0.37       86.8    92.6   93.7   69.6   68.2   80.5   89.6   74.9
CRCoV                    31028    0.37       86.5    92.3   93.5   69.9   67.6   77.4   90.0   74.7
GiCoV                    30979    0.37       86.8    92.6   93.7   69.6   68.4   78.6   89.6   74.9
DcCoV UAE-HKU23          31036    0.37       86.8    92.6   93.4   69.6   68.1   77.4   90.5   74.4
Murine coronavirus
MHV                      31357    0.42       82.8    90.3   90.5   39.9   63.8   63.9   82.7   67.9
RCoV                     31250    0.41       82.5    90.3   90.5   59.3   63.3   62.5   80.5   67.5
HCoV-HKU1                29926    0.32       82.2    88.1   88.9   50.1   60.4   53.0   78.4   62.8
RbCoV HKU14              31084    0.38       86.8    92.5   94.7   69.9   67.9   74.2   91.3   73.9
ChRCoV HKU24-R05009I     31234    0.40       100     100    99.8   99.8   100    100    100    100
ChRCoV HKU24-R05010I     31324    0.40       100     100    100    99.8   100    100    100    100

Betacoronavirus lineage B
SARS-CoV                 29751    0.41       49.0    66.8   68.6   -      29.9   26.5   37.7   34.3
SARSr-Rh-BatCoV HKU3     29728    0.41       48.4    66.7   68.8   -      29.5   26.5   38.1   34.1

Betacoronavirus lineage C
Ty-BatCoV HKU4           30286    0.38       52.3    68.6   68.6   -      33.0   25.6   42.4   36.7
Pi-BatCoV HKU5           30488    0.43       52.0    68.6   67.1   -      31.4   25.6   42.9   35.9
MERS-CoV                 30107    0.41       53.3    68.7   67.1   -      31.9   29.3   43.3   37.7

Betacoronavirus lineage D
Ro-BatCoV HKU9           29114    0.41       46.9    67.1   68.4   -      28.6   25.6   42.4   33.3

Gammacoronavirus
IBV                      27608    0.38       43.9    62.0   59.8   -      27.2   21.6   31.5   27.6
BWCoV SW1                31686    0.39       44.3    60.2   57.7   -      25.4   24.7   26.7   29.2
BdCoV HKU22              31759    0.39       44.3    60.6   57.9   -      25.2   23.1   25.1   29.2

Deltacoronavirus
BuCoV HKU11              26476    0.39       37.5    51.1   48.9   -      26.3   25.6   28.9   24.5
ThCoV HKU12              26396    0.38       38.0    51.8   48.4   -      26.2   23.6   30.6   22.1
MunCoV HKU13             26552    0.43       38.5    53.1   50.3   -      26.0   21.3   28.8   21.7
PorCoV HKU15             25421    0.43       40.4    52.2   49.0   -      25.6   25.3   26.9   24.2
WECoV HKU16              26027    0.40       39.1    51.9   49.3   -      25.6   23.3   28.2   22.2
SpCoV HKU17              26067    0.45       40.8    52.0   49.0   -      25.5   21.6   27.3   25.7
MRCoV HKU18              26674    0.47       38.8    51.9   49.3   -      26.1   22.5   28.9   23.7
NHCoV HKU19              26064    0.38       35.2    53.7   48.0   -      24.2   23.9   30.8   23.1
WiCoV HKU20              26211    0.39       36.9    51.6   48.8   -      26.8   28.6   27.8   23.2
CMCoV HKU21              26216    0.35       37.6    51.6   50.2   -      25.1   24.7   26.1   22.2


^a TGEV, porcine transmissible gastroenteritis virus; MCoV, mink coronavirus; CCoV, canine coronavirus; FIPV, feline infectious
peritonitis virus; PRCV, porcine respiratory coronavirus; HCoV-229E, human coronavirus 229E; HCoV-NL63, human
coronavirus NL63; PEDV, porcine epidemic diarrhea virus; Rh-BatCoV HKU2, Rhinolophus bat coronavirus HKU2;
Mi-BatCoV 1A, Miniopterus bat coronavirus 1A; Mi-BatCoV HKU8, Miniopterus bat coronavirus HKU8; Sc-BatCoV 512,
Scotophilus bat coronavirus 512; Ro-BatCoV HKU10, Rousettus bat coronavirus HKU10; Hi-BatCoV HKU10, Hipposideros bat
coronavirus HKU10; HCoV-OC43, human coronavirus OC43; BCoV, bovine coronavirus; PHEV, porcine hemagglutinating
encephalomyelitis virus; ECoV, equine coronavirus; SACoV, sable antelope CoV; CRCoV, canine respiratory coronavirus;
GiCoV, giraffe coronavirus; DcCoV UAE-HKU23, dromedary camel coronavirus UAE-HKU23; MHV, murine hepatitis virus;
RCoV, rat coronavirus; HCoV-HKU1, human coronavirus HKU1; SARS-CoV, SARS coronavirus; SARSr-Rh-BatCoV HKU3,
SARS-related Rhinolophus bat coronavirus HKU3; Ty-BatCoV HKU4, Tylonycteris bat coronavirus HKU4; Pi-BatCoV HKU5,
Pipistrellus bat coronavirus HKU5; MERS-CoV, Middle East respiratory syndrome coronavirus; Ro-BatCoV HKU9, Rousettus
bat coronavirus HKU9; IBV, infectious bronchitis virus; BWCoV SW1, beluga whale coronavirus SW1; BdCoV HKU22,
bottlenose dolphin coronavirus HKU22; BuCoV HKU11, Bulbul coronavirus HKU11; ThCoV HKU12, Thrush coronavirus
HKU12; MunCoV HKU13, Munia coronavirus HKU13; PorCoV HKU15, porcine coronavirus HKU15; WECoV HKU16,
white-eye coronavirus HKU16; SpCoV HKU17, sparrow coronavirus HKU17; MRCoV HKU18, magpie robin coronavirus HKU18;
NHCoV HKU19, night heron coronavirus HKU19; WiCoV HKU20, wigeon coronavirus HKU20; CMCoV HKU21, common
moorhen coronavirus HKU21.

Table 3. Coding potential and predicted domains in different proteins of ChRCoV HKU24

ORF     Nucleotide     No. of       No. of  Frame   Putative function or domain^a                   Positions (aa)      Putative TRS
        positions      nucleotides  amino                                                                               Nucleotide   TRS sequence
        (start-end)                 acids                                                                               position in  (distance in bases
                                                                                                                        genome       to AUG)^b

1ab     213-21637      21425        7141    +3, +2                                                                      63           CUAAAC(144)AUG
nsp1    213-950        738          246     +3      Unknown                                         1-246
nsp2    951-2714       1764         588     +3      Unknown                                         247-834
nsp3    2715-8603      5889         1963    +3      Acidic domain, hydrophobic domain, ADRP,        835-2797
                                                    putative PLpro domains PL1pro, PL2pro
nsp4    8604-10091     1488         496     +3      Hydrophobic domain                              2798-3293
nsp5    10092-11000    909          303     +3      3CLpro                                          3294-3596
nsp6    11001-11861    861          287     +3      Hydrophobic domain                              3597-3883
nsp7    11862-12128    267          89      +3      Unknown                                         3884-3972
nsp8    12129-12719    591          197     +3      Unknown                                         3973-4169
nsp9    12720-13049    330          110     +3      Unknown                                         4170-4279
nsp10   13050-13460    411          137     +3      Unknown                                         4280-4416
nsp11   13461-13505    45           14      +3      Unknown (short peptide at the end of ORF1a)     4417-4430
nsp12   13461-16243    2783         928     +2      RdRp                                            4417-5344
nsp13   16244-18040    1797         599     +2      Hel                                             5345-5943
nsp14   18041-19603    1563         521     +2      ExoN, N7-MTase                                  5944-6464
nsp15   19604-20728    1125         375     +2      NendoU                                          6465-6839
nsp16   20729-21637    909          302     +2      O-MT                                            6840-7141
NS2a    21639-22469    831          276     +3                                                                          21629        CUAAAC(4)AUG
HE      22484-23761    1278         425     +2      Hemagglutinin domain                            129-266             22466        CUGAAC(12)AUG
                                                    Cleavage site                                   Between 1 and 18
                                                    Active site for neuraminate O-acetyl-esterase   38-41
                                                    activity, FGDS
S       23777-27853    4077         1358    +2      Type I membrane glycoprotein                                        23771        CUAAACAUG
                                                    N terminal domain                               16-299
                                                    Cleavage site                                   Between 763 and 764
                                                    2 heptad repeats                                1045-1079 (HR1),
                                                                                                    1253-1285 (HR2)
                                                    Transmembrane domain                            1303-1325
                                                    Cytoplasmic tail rich in cysteine residues
NS4     27946-28356    411          136     +1      Transmembrane domain                            7-29
NS5     28338-28652    315          104     +3                                                                          28286        GUAAAC(46)AUG
E       28645-28893    249          82      +1      2 transmembrane domains                         13-37 and 38-82
M       28908-29603    696          231     +3      3 transmembrane domains                         26-45, 50-72 and    28899        CUAAAC(3)AUG
                                                                                                    79-101
N2      29596-30288    693          230     +1
N       29613-30944    1332         443     +3                                                                          29600        CUAAAC(7)AUG

^a ADRP: adenosine diphosphate-ribose 1''-phosphatase; PL1pro, PL2pro: papain-like protease 1 and papain-like protease 2; 3CLpro: 3C-like protease;
RdRp: RNA-dependent RNA polymerase; Hel: helicase; ExoN: 3'-to-5' exonuclease; N7-MTase: (guanine-N7)-methyltransferase; NendoU:
nidoviral uridylate-specific endoribonuclease; O-MT: 2'-O-ribose methyltransferase.
^b Boldface indicates putative TRS sequences.
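
The coordinates in Table 3 can be cross-checked with simple arithmetic. The Python sketch below, which is illustrative and not part of the original analysis, tests two relationships on rows copied from the table: that the number of nucleotides equals end minus start plus one and is three times the number of amino acids (for the nsps listed), and that the TRS position plus the TRS core length plus the stated distance gives the position of the AUG. The assumptions are that the TRS core is the 6-nt motif shown in the table, that the distance is counted from the base following the core, and that a TRS written without a number (CUAAACAUG for S) means a distance of zero.

```python
TRS_CORE_LENGTH = 6  # assumed length of the TRS core motif, e.g. CUAAAC

# (ORF, start, end, no. of nucleotides, no. of amino acids), from Table 3.
NSP_ROWS = [
    ("nsp1", 213, 950, 738, 246),
    ("nsp2", 951, 2714, 1764, 588),
    ("nsp3", 2715, 8603, 5889, 1963),
    ("nsp4", 8604, 10091, 1488, 496),
    ("nsp5", 10092, 11000, 909, 303),
    ("nsp6", 11001, 11861, 861, 287),
]

# (ORF, TRS position in genome, distance in bases to AUG, ORF start), from Table 3.
TRS_ROWS = [
    ("1ab", 63, 144, 213),
    ("NS2a", 21629, 4, 21639),
    ("HE", 22466, 12, 22484),
    ("S", 23771, 0, 23777),  # distance 0 is an assumption (no number given)
    ("NS5", 28286, 46, 28338),
    ("M", 28899, 3, 28908),
    ("N", 29600, 7, 29613),
]


def length_consistent(start: int, end: int, n_nt: int, n_aa: int) -> bool:
    """True if the span length matches n_nt and n_nt encodes exactly n_aa codons."""
    return end - start + 1 == n_nt and n_nt == 3 * n_aa


def trs_consistent(trs_pos: int, distance: int, orf_start: int) -> bool:
    """True if the TRS core plus the stated distance ends right before the AUG."""
    return trs_pos + TRS_CORE_LENGTH + distance == orf_start


if __name__ == "__main__":
    for name, start, end, n_nt, n_aa in NSP_ROWS:
        print(name, "lengths consistent:", length_consistent(start, end, n_nt, n_aa))
    for name, trs_pos, distance, orf_start in TRS_ROWS:
        print(name, "TRS consistent:", trs_consistent(trs_pos, distance, orf_start))
```

The codon check is deliberately restricted to cleavage products without a stop codon or frameshift; ORFs that end in a stop codon, and nsp12, which spans the ribosomal frameshift, would need a separate rule.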

Table 4. Cleavage sites used between nsps in lineage A betacoronaviruses

               ChRCoV HKU24^a   Betacoronavirus 1   RbCoV HKU14   MHV    RCoV   HCoV-HKU1

nsp1|nsp2      G|L              G|V                 G|V           G|V    G|V    G|I
nsp2|nsp3      A|G              A|G                 A|G           A|G    A|G    A|G
nsp3|nsp4      G|A              G|A                 G|A           G|A    G|A    G|V
nsp4|nsp5      Q|S              Q|S                 Q|S           Q|S    Q|S    Q|S
nsp5|nsp6      Q|S              Q|S                 Q|S           Q|S    Q|S    Q|S
nsp6|nsp7      Q|S              Q|S                 Q|S           Q|S    Q|S    Q|S
nsp7|nsp8      Q|A              Q|A                 Q|A           Q|A    H|A    Q|A
nsp8|nsp9      Q|N              Q|N                 Q|N           Q|N    Q|N    Q|N
nsp9|nsp10     Q|A              Q|A                 Q|A           Q|A    Q|A    Q|A
nsp10|nsp12    Q|S              Q|S                 Q|S           Q|S    Q|S    Q|S
nsp12|nsp13    Q|S              Q|S                 Q|S           Q|S    Q|S    Q|S
nsp13|nsp14    Q|C              Q|C                 Q|C           Q|C    Q|C    H|C
nsp14|nsp15    Q|S              Q|S                 Q|S           Q|S    Q|S    Q|S
nsp15|nsp16    Q|A              Q|A                 Q|A           Q|A    Q|A    Q|A

^a Unique cleavage site in ChRCoV HKU24 is in bold.

Table 5. Pairwise comparisons of Coronaviridae-wide conserved domains in replicase
polyprotein 1ab between ChRCoV HKU24 and other lineage A betacoronaviruses

                        Pairwise amino acid identity of ChRCoV HKU24 (%)

Replicase polyprotein   Betacoronavirus 1   RbCoV HKU14   Murine coronavirus   HCoV-HKU1
domains

nsp3 (ADRP)             74.8-81.7           74.8          69.5-70.2            71
nsp5 (3CLpro)           85.8-86.8           86.8          82.5-82.8            82.2
nsp12 (RdRp)            91.8-92.6           92.5          90.3                 88.1
nsp13 (Hel)             93.4-94.8           94.7-94.8     90.5-90.7            88.9-89.1
nsp14 (ExoN)            86.4-88.7           88.7          83.9-84.1            80.2
nsp15 (NendoU)          77.6-79.2           79.5          72.0-73.6            70.1
nsp16 (O-MT)            88.7-89.7           89.1          83.8-85.1            84.1
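
The statement in the Discussion that ChRCoV HKU24 shows <90% aa identity to all other lineage A betacoronaviruses in five of the seven conserved replicase domains can be checked against Table 5. The Python sketch below is illustrative only: it takes, for each domain, the highest identity reported in any column of Table 5 and counts the domains that fall below a 90% threshold. The variable names are hypothetical, and the use of the upper bound of each range is an assumption about how the comparison is read.

```python
# Highest pairwise aa identity (%) of ChRCoV HKU24 to any other lineage A
# betacoronavirus, per conserved replicase domain (upper bounds from Table 5).
MAX_IDENTITY = {
    "nsp3 (ADRP)": 81.7,
    "nsp5 (3CLpro)": 86.8,
    "nsp12 (RdRp)": 92.6,
    "nsp13 (Hel)": 94.8,
    "nsp14 (ExoN)": 88.7,
    "nsp15 (NendoU)": 79.5,
    "nsp16 (O-MT)": 89.7,
}

THRESHOLD = 90.0  # identity threshold (%) referred to in the Discussion


def domains_below(identities: dict, threshold: float) -> list:
    """Return the domains whose highest identity is below the threshold."""
    return [domain for domain, value in identities.items() if value < threshold]


if __name__ == "__main__":
    below = domains_below(MAX_IDENTITY, THRESHOLD)
    print(f"{len(below)} of {len(MAX_IDENTITY)} domains below {THRESHOLD}%:")
    for domain in below:
        print("  ", domain)
```

Only RdRp and Hel exceed the threshold, which matches the five-of-seven count given in the text.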